11 research outputs found

    HATCH: Hash Table Caching in Hardware for Efficient Relational Join on FPGA

    In this paper we present HATCH, a novel hash join engine. We follow a new design point that lets us effectively cache hash table entries in fast BRAM resources while supporting collision resolution in hardware. HATCH offers the best of both worlds: (i) it uses the full capacity of the DDR memory to store complete hash tables, and (ii) by employing a cache, it exploits the high access speed of BRAMs. We demonstrate the usefulness of our approach by running hash join operations from 5 TPC-H benchmark queries and report speedups of up to 2.8x over a pipeline-optimized baseline. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013), for the Advanced Analytics for Extremely Large European Databases (AXLE) project under grant agreement number 318633, and from the Ministry of Economy and Competitiveness of Spain under contract number TIN2012-34557. Postprint (author's final draft)

    An empirical evaluation of High-Level Synthesis languages and tools for database acceleration

    High-Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS tool for a certain class of algorithms depends on requirements such as area and throughput, as well as on programmer experience. In this paper, we explore the trade-offs present when using a representative set of HLS tools in the context of Database Management Systems (DBMS) acceleration. More specifically, we conduct an empirical analysis of four representative frameworks (Bluespec SystemVerilog, Altera OpenCL, LegUp and Chisel) that we use to accelerate commonly used database algorithms such as sorting, the median operator, and hash joins. Through our implementation experience and empirical results for database acceleration, we conclude that the selection of the most suitable HLS tool depends on a set of orthogonal characteristics, which we highlight for each HLS framework. Peer Reviewed. Postprint (author's final draft)

    The restitution process in conservation: discovering the history of Şehit Ali Paşa library

    Conservation of cultural heritage sites is a multi-phased process comprising architectural survey, restitution and restoration. The survey phase begins with preliminary research and in situ analysis, which are crucial for comprehending the specifications, potential and architectural characteristics of the site. After the architectural survey, restitution is carried out to understand the state of the site in its first period and how it has changed over the course of its history. During the restitution studies, alternatives are prepared for various periods using the traces on the building and archival documents. In the last stage of the conservation process, restoration, the interventions for deterioration, repair proposals and spatial organization are determined according to the new or current use. This study focuses on the restitution process of the Sehit Ali Pasa Library, which is currently located in the garden of Vefa High School in the Kalenderhane Neighbourhood, and consists of four main parts. The first part gives general information on the conservation process. In the second part, the historical background, location, spatial organization, construction technique and materials of the building are examined. The restitution, or historical analysis, process is the main theme of the third part, and all the findings and considerations are evaluated and interpreted in the final part. In this study, all characteristic features and layers of the cultural heritage are discussed and documented to indicate the importance of architectural survey and restitution interpretation in the conservation process, using the Sehit Ali Pasa Library as a multi-layered example. Unfortunately, the findings in the archives and libraries are too limited to propose definite restitution alternatives or a precise historical description for this 18th-century library, but it is crucial to underline the importance of a detailed research process, methodology and architectural survey in preparing a scientific, reasonable and consistent historical analysis of cultural heritage such as the Sehit Ali Pasa Library, a multi-layered and complex building. Publisher's Version

    A multicore emulator with a profiling infrastructure for transactional memory on FPGA

    This thesis attempts to bring together two recent topics by presenting a flexible Transactional Memory environment on a multicore prototype that is realized on FPGA fabric. For this, we devise a MIPS-compatible shared-memory multicore emulator with Hybrid Transactional Memory support, based on the Plasma open source soft processor core. We present the design and implementation of the TMbox system, which features an emulation system of up to 16 soft processor cores interconnected with a bi-directional ring bus, running at 50 MHz on a Virtex5-155t FPGA. Additionally, we build the first comprehensive infrastructure to profile Hybrid TM systems, an extensive visualization environment that enables examining complete transactional executions in detail. TMbox is a completely modifiable architecture implementing a multicore prototype with support for STM, HTM and Hybrid TM. It was written in various common design languages, and enables modifying the complete stack, from the ISA, through the software toolchain, up to the well-optimized parallel code. With such an infrastructure, fast execution and quick performance evaluation become possible for studies in computer architecture. Postprint (published version)

    From plasma to beefarm: Design experience of an FPGA-based multicore prototype

    In this paper, we take a MIPS-based open-source uniprocessor soft core, Plasma, and extend it to obtain the Beefarm infrastructure for FPGA-based multiprocessor emulation, a popular research topic of the last few years in both the FPGA and the computer architecture communities. We discuss various design tradeoffs and demonstrate, through experimental results, superior scalability compared to traditional software instruction set simulators. Based on our experience of designing and building a complete FPGA-based multiprocessor emulation system that supports runtime and compiler infrastructure, and on the actual executions of our experiments running Software Transactional Memory (STM) benchmarks, we comment on the pros, cons and future trends of using hardware-based emulation for research. Peer Reviewed

    Results from the A303 (Wylye) axle weight surveys (1988)

    SIGLE. Available from British Library Document Supply Centre, DSC:3425.926 (TRRL-CR--210) / BLDSC - British Library Document Supply Centre. GB. United Kingdom
